Towards Efficient Discovery of Frequent Patterns with Relative Support
Authors
Abstract
Frequent patterns are an important class of regularities in a database. Although no measure of pattern interestingness is universally accepted as best, relative support is emerging as a popular measure for discovering frequent patterns that involve both frequent and rare items. An Apriori-like algorithm, Relative Support Apriori (RSA), has been discussed in the literature for discovering such patterns. Mining frequent patterns with RSA is computationally expensive because the discovered patterns do not satisfy the anti-monotonic property. RSA also suffers from two performance problems: it generates a huge number of candidate patterns and requires multiple scans of the database. This paper aims to discover frequent patterns efficiently with the relative support measure. To reduce the computational cost, we show theoretically that the patterns discovered with the relative support measure satisfy the convertible anti-monotonic property. Using this property, we propose a pattern-growth algorithm, Relative Support Frequent Pattern-growth (RSFP-growth), to discover the patterns. Experimental results on both synthetic and real-world datasets show that the proposed RSFP-growth algorithm is significantly better than the RSA algorithm.
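The abstract does not define relative support, so the following Python sketch rests on an assumption: that the relative support of a pattern is its support divided by the support of its rarest item, a formulation commonly used in the relative-support literature. The toy database and the helper names `support` and `relative_support` are hypothetical. Under that assumption, the sketch illustrates why the measure is not anti-monotonic (a superset can score higher than one of its subsets) and why ordering items by ascending support restores a prefix-based, convertible anti-monotonic behavior.

```python
def support(pattern, transactions):
    """Number of transactions that contain every item of `pattern`."""
    items = set(pattern)
    return sum(1 for t in transactions if items <= t)


def relative_support(pattern, transactions):
    """Assumed definition: sup(X) divided by the support of X's rarest item."""
    rarest = min(support([item], transactions) for item in pattern)
    return support(pattern, transactions) / rarest if rarest else 0.0


# Hypothetical toy database: 'c' is rare and always co-occurs with 'a'.
transactions = (
    [{"a", "b", "c"}] * 2
    + [{"a", "c"}]
    + [{"a"}] * 8
    + [{"b"}] * 8
)

# Not anti-monotonic: the superset {a, b, c} scores higher than its subset {a, b}.
rs_ab = relative_support(["a", "b"], transactions)
rs_abc = relative_support(["a", "b", "c"], transactions)

# Order the items by ascending support (rarest first) and score each prefix.
ordered = sorted(["a", "b", "c"], key=lambda i: (support([i], transactions), i))
prefix_scores = [
    relative_support(ordered[:k], transactions) for k in range(1, len(ordered) + 1)
]

print(rs_ab, rs_abc)
print(ordered, prefix_scores)
```

In this ordering every prefix shares the same rarest item, so the denominator stays fixed while the numerator can only shrink as the prefix grows. That is the intuition behind a convertible anti-monotonic property; the paper's own proof and item ordering may differ from this sketch.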
Similar resources
Efficient Candidacy Reduction For Frequent Pattern Mining
Knowledge discovery, or extracting knowledge from large amounts of data, is a desirable task in competitive businesses. Data mining is a main step in the knowledge discovery process. Frequent patterns play a central role in data mining tasks such as clustering, classification, and association analysis. Identifying all frequent patterns is the most time consuming process due...
Discovering Periodic-Frequent Patterns in Transactional Databases
Since mining frequent patterns from transactional databases involves an exponential mining space and generates a huge number of patterns, efficient discovery of user-interest-based frequent pattern set becomes the first priority for a mining algorithm. In many real-world scenarios it is often sufficient to mine a small interesting representative subset of frequent patterns. Temporal periodicity...
SLPMiner: An Algorithm for Finding Frequent Sequential Patterns Using Length-Decreasing Support Constraint
Over the years, a variety of algorithms for finding frequent sequential patterns in very large sequential databases have been developed. The key feature in most of these algorithms is that they use a constant support constraint to control the inherently exponential complexity of the problem. In general, patterns that contain only a few items will tend to be interesting if they have a high suppo...
BAMBOO: Accelerating Closed Itemset Mining by Deeply Pushing the Length-Decreasing Support Constraint
A previous study has shown that mining frequent patterns with a length-decreasing support constraint is very helpful in removing some uninteresting patterns, based on the observation that short patterns tend to be interesting if they have high support, whereas long patterns can still be very interesting even if their support is relatively low. However, a large number of non-closed (i.e., redu...
Parallel Regular-Frequent Pattern Mining in Large Databases
Mining interesting patterns in various domains is an important area in the data mining and knowledge discovery process. A number of parallel and distributed frequent pattern mining algorithms have been proposed so far for large and/or distributed databases. Occurrence frequency is not the only criterion for mining patterns; the occurrence behavior (regularity) of a pattern may also be inclu...